Computer Vision and Image Characteristic Search

ABSTRACT

Computer vision and image characteristic search is described. The described system leverages visual search techniques by determining visual characteristics of objects depicted in images and comparing the determined characteristics to visual characteristics of other images, e.g., to identify similar visual characteristics in the other images. In some aspects, the described system performs searches that leverage a digital image as part of a search query to locate digital content of interest. In some aspects, the described system surfaces multiple user interface instrumentalities that include images of patterns, textures, or materials and that are selectable to initiate a visual search of digital content having a similar pattern, texture, or material. The described aspects also include pattern-based authentication in which the system determines authenticity of an item in an image based on a similarity of its visual characteristics to visual characteristics of known authentic items.

RELATED APPLICATION

This application is a continuation of and claims priority to U.S. patent application Ser. No. 16/388,473, filed Apr. 18, 2019, which is a divisional of and claims priority to U.S. patent application Ser. No. 16/235,140, filed Dec. 28, 2018, which claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application No. 62/612,275, filed Dec. 29, 2017, and titled “Computer Vision.” These disclosures are hereby incorporated by reference in their entireties.

BACKGROUND

Conventional text-based search systems depend on a user's ability to express a goal of a search using text. Thus, although these systems may function well in instances in which a goal is readily expressible using text (e.g., find “red running shoes”), these systems may fail in instances in which it is difficult to express this goal using text. This problem is further exacerbated by a requirement in these conventional systems that a common understanding is reached between how items in a search result are identified and techniques used to express the goal, for instance, that both a seller listing an item and a prospective buyer searching for the item agree that the item is described with text as “red running shoes.” Further still, text descriptions provided by users and describing items depicted in images may not be accurate. Accordingly, conventional systems that rely on these user-provided descriptions to list items may propagate inaccurate descriptions of the items, e.g., by surfacing a listing with an inaccurate description to other users.

SUMMARY

To overcome these problems, computer vision and image characteristic search is leveraged in a digital medium environment. Rather than searching for images by comparing text queries to text data of images, the system described herein leverages visual search techniques where the system determines visual characteristics of objects depicted in images and compares the determined characteristics to visual characteristics of other images, e.g., to identify whether the other images have similar visual characteristics. In some aspects, the described system performs searches that leverage a digital image as part of a search query to locate digital content of interest, e.g., listings of particular goods and services. These digital images may be used to identify characteristics that otherwise may be difficult to describe, such as patterns, a shape of an object (e.g., a collar having a particular shape, a type of heel on a shoe), and so forth. In some aspects, the described system surfaces multiple user interface instrumentalities that include images of patterns, textures, or materials and that are selectable to initiate a visual search of digital content having a similar pattern, texture, or material. The described aspects also include pattern-based authentication in which the system determines authenticity of an item in an image based on a similarity of its visual characteristics to visual characteristics of known authentic items, such as stitching patterns, component movement, and so forth.

This Summary introduces a selection of concepts in a simplified form that are further described below in the Detailed Description. As such, this Summary is not intended to identify essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

The detailed description is described with reference to the accompanying figures.

FIG. 1 is an illustration of an environment in an example implementation that is operable to employ techniques described herein.

FIG. 2 depicts an example system in which operation of the camera platform manager module of FIG. 1 is depicted in greater detail.

FIGS. 3A-3B depict example implementations of user interaction with a camera platform to define and refine an image search based on inferred image characteristics.

FIG. 4 depicts a procedure in an example implementation in which multiple digital images are used to infer characteristics to be used as a basis to perform a search for digital content.

FIGS. 5A-5B depict examples and implementations of image search.

FIG. 6 depicts another example environment that is operable to employ aspects of listings with patterns, textures, and materials as described herein.

FIG. 7 depicts an example user interface having images of a plurality of different patterns that are presented to enable visual searches to be performed of visual listing data.

FIG. 8 depicts an example scenario in which an image is captured of an item that is to be listed.

FIG. 9 depicts a procedure in an example implementation in which a user interface having a plurality of selectable images of patterns is used to conduct a visual search for images.

FIG. 10 depicts another example environment that is operable to employ aspects of pattern-based authentication as described herein.

FIG. 11 depicts an example scenario in which a client device user uses a mobile device to capture visual content of an item that is selected to be listed as authentic.

FIG. 12 depicts a procedure in an example implementation in which an item to be listed with a listing service is determined authentic or not based on known visual characteristics of authenticity.

FIG. 13 illustrates an example system including various components of an example device that can be implemented as any type of computing device as described and/or utilized with reference to FIGS. 1-12 to implement embodiments of the techniques described herein.

DETAILED DESCRIPTION

Overview

Conventional text-based search systems depend on a user's ability to express a goal of a search using text. Thus, although these systems may function well in instances in which a goal is readily expressible using text (e.g., find “red running shoes”), these systems may fail in instances in which it is difficult to express this goal using text. This problem is further exacerbated by a requirement in these conventional systems that a common understanding is reached between how items in a search result are identified and techniques used to express the goal.

To overcome these problems, computer vision and image characteristic search is leveraged in a digital medium environment. Rather than leverage image searches that compare text queries to text data (e.g., metadata) of images, the system described herein leverages visual search techniques where the system determines visual characteristics of objects depicted in images and compares the determined characteristics to visual characteristics of other images, e.g., to identify whether the other images have similar visual characteristics.

In some aspects, the described system performs searches that leverage multiple digital images as part of a search query to locate digital content of interest, e.g., listings of particular goods and services. These digital images may be used to identify characteristics that otherwise may be difficult to describe, such as patterns, a shape of an object (e.g., a collar having a particular shape, a type of heel on a shoe), and so forth. In some scenarios, for instance, the described system presents user interfaces that allow users to select multiple, different digital images or portions of images, e.g., from a repository of images and/or a live feed of images. The described system then uses respective visual characteristics of the different digital images or portions to identify the digital content of interest. The described system is also configured to receive user inputs providing a pattern, such as user inputs to draw a pattern (e.g., with a stylus or a touch input device) via a user interface. The described system then uses visual characteristics of the user-provided drawing as a basis for performing an image-based search. Given such user input, the system identifies and presents search results that are based on the user-provided drawing.

The described system is also capable of using a first characteristic depicted in a first selected image (e.g., a shape of an object) and a second characteristic depicted in a second selected image (e.g., a pattern) to locate digital visual content (e.g., a single image) having both the first and second characteristics (e.g., depicting an object having the shape and the pattern). This enables the described system to match search results with search goals that are difficult for users to express using text. Indeed, the described system relieves users of having to convey their search goals using text and also allows them to convey different parts of a search goal with different images.

In some aspects, the described system surfaces multiple user interface instrumentalities that include images of patterns, textures, or materials. Each of these instrumentalities is selectable to initiate a visual search of digital content having a similar pattern, texture, or material. It may be difficult, for instance, for a client device user who is providing input to the system to describe patterns, such as particular plaid patterns having varying numbers and sizes of vertical and horizontal bars. To this end, the system surfaces a user interface to searching users that includes multiple user-interface instrumentalities depicting different patterns. These patterns are selectable, such as with touch input, stylus input, voice input, and so forth. Responsive to such a selection, the system initiates a visual search using data (e.g., one or more feature vectors) describing the selected image of the pattern or a portion of it as a search query.

The described aspects also include pattern-based authentication. Here, the described system determines authenticity of an item depicted in an image based on a similarity of its visual characteristics to visual characteristics of known authentic items, such as stitching patterns, component movement, and so forth. In these scenarios, the system obtains visual content (e.g., one or more images or videos) of a product or service that is to be listed as authentic and confirms or denies a designation of authenticity. To confirm or deny an authentic designation, the pattern-based authentication system compares determined visual characteristics of the product or service depicted in the obtained visual content to characteristics in visual content of a product or service known to be authentic. To do so, the system may use image or video processing techniques along with visual pattern matching to determine whether a captured pattern matches a known authentic pattern.

In the following discussion, an example environment is first described that may employ the techniques described herein. Example implementation details and procedures are then described which may be performed in the example environment as well as other environments. Consequently, performance of the example procedures is not limited to the example environment and the example environment is not limited to performance of the example procedures.

Example Environment

FIG. 1 is an illustration of a digital medium environment 100 in an example implementation that is operable to employ the techniques described herein. The illustrated environment 100 includes a computing device 102 that is communicatively coupled to a service provider system 104 via a network 106. Computing devices that implement the computing device 102 and the service provider system 104 may be configured in a variety of ways.

A computing device, for instance, may be configured as a desktop computer, a laptop computer, a mobile device (e.g., assuming a handheld configuration such as a tablet or mobile phone), configured to be worn (e.g., as goggles as depicted in the illustrated environment 100), and so forth. Thus, a computing device may range from full resource devices with substantial memory and processor resources (e.g., personal computers, game consoles) to a low-resource device with limited memory and/or processing resources (e.g., mobile devices). Additionally, although a single computing device is shown, a computing device may be representative of a plurality of different devices, such as multiple servers utilized by a business to perform operations “over the cloud” for the service provider system 104 as described in FIG. 13.

In the illustrated environment 100, the computing device 102 is depicted as being worn by a user 108 in a physical environment, e.g., a living room 110. In this example, the computing device 102 includes a digital camera 112 that is configured to capture digital images 114 of an outside physical environment (e.g., the living room 110), such as through use of a charge coupled device (CCD) sensor. The captured digital images 114 may then be stored as pixels in a computer-readable storage medium and/or rendered for display by a display device, e.g., LCD, OLED, LED, etc.

The computing device 102 also includes a camera platform manager module 116 that is configured to implement and execute a camera platform 118 (e.g., through use of a processing system and computer-readable storage media) that may serve as a basis for a variety of functionality. The camera platform 118, for instance, may implement a “live view” formed of digital images 114 taken of the physical environment of the computing device 102. These digital images 114 may then serve as a basis to support other functionality.

An example of this functionality is illustrated as an object inventory manager module 120. The object inventory manager module 120 is representative of functionality to manage an inventory of objects. This may include objects that are owned by the user 108 and/or objects that are desired by the user 108, e.g., for purchase. This may be implemented by the object inventory manager module 120 through use of the camera platform 118 in a variety of ways.

In a first such example, the object inventory manager module 120 is configured to collect digital images 114. This may include digital images 114 of physical objects in the living room 110 in this example or digital images captured of physical photos, e.g., from a magazine, a picture taken of a television screen or other display device, and so on. The digital image 114 may also be captured of a user interface output by the computing device 102, e.g., as a screenshot from a frame buffer.

The object inventory manager module 120 includes object recognition functionality to recognize objects included within the digital image 114, e.g., via machine learning. From this, the object inventory manager module 120 may collect data pertaining to these recognized objects. Data describing the recognized objects, for instance, may be communicated via the network 106 to the service provider system 104. The service provider system 104 includes a service manager module 122 that is configured to obtain data related to the objects (e.g., through use of a search) from a storage device 124. The service provider system 104 can then communicate this data back to the computing device 102 via the network 106 for use by the object inventory manager module 120.

The object inventory manager module 120, for instance, may generate augmented reality digital content 126 (illustrated as stored in a storage device 128) for output via a user interface of the computing device 102 as part of a “live feed” of digital images taken of the physical environment, e.g., the living room 110. The AR digital content 126, for instance, may describe characteristics of an object in the living room 110, a brand name of the object, a price for which the object is available for sale or purchase (e.g., via an online auction), and so forth. This AR digital content 126 is then displayed on the user interface for viewing proximal to the object by the object inventory manager module 120. In this way, the camera platform supports functionality for the user 108 to “look around” the living room 110 and view additional object information and insight into characteristics of objects included within the physical environment. Further discussion of this example is described in relation to FIGS. 2-5 in the following discussion.

FIG. 2 depicts a system 200 in an example implementation showing operation of the camera platform manager module 116 of FIG. 1 in greater detail. The following discussion describes techniques that may be implemented utilizing the previously described systems and devices. Aspects of the procedure as shown stepwise by the modules of FIG. 2 may be implemented in hardware, firmware, software, or a combination thereof. The procedure is shown as a set of blocks that specify operations performed by one or more devices and are not necessarily limited to the orders shown for performing the operations by the respective blocks.

A digital image 114 is obtained by the camera platform manager module 116. The digital image 114, for instance, may be captured using the digital camera 112, as a screenshot captured from a frame buffer of the computing device 102, and so forth. The digital image 114 is then processed by an object recognition module 202 to recognize an object within the digital image 114. The object recognition module 202, for instance, may employ a machine learning module 204 configured to employ models 206 usable to recognize the object using machine learning, e.g., neural networks, convolutional neural networks, deep learning networks, support vector machines, decision trees, and so forth. The models 206, for instance, may be trained using training digital images that are tagged with corresponding identifications.

In an implementation, these training digital images and tags are obtained from a commerce service provider system, where the images are tagged by sellers using the system. As a result, a multitude of accurately tagged training digital images may be obtained with minimal computational and user cost as opposed to conventional manual tagging techniques. Although illustrated as implemented locally by the computing device 102, this functionality may also be implemented in whole or in part by the service provider system 104 via the network 106.

Thus, the object recognition data 208 describes an object included in the digital image 114. In accordance with the described techniques, this object recognition data 208 may correspond to text data describing the recognized object. Additionally or alternately, the object recognition data 208 may correspond to feature data (e.g., a feature vector), which is indicative of visual characteristics of the recognized object. An object data collection module 210 is then employed to collect object metadata 212 that pertains to the recognized object. In scenarios where the object recognition data 208 corresponds to feature data, this object metadata 212 may include a textual description of the recognized object. This metadata collection may be performed locally through a search of a local storage device and/or remotely through interaction with a service manager module 122 of a service provider system 104 via the network 106.
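
As one illustration of how the object recognition data 208 may take the form of feature data, the following sketch extracts a fixed-length feature vector from a digital image with a pretrained convolutional network. The choice of a torchvision ResNet-50 backbone and the preprocessing values are assumptions made for illustration; the disclosure does not name a particular model.

```python
# Hedged sketch: one way the object recognition data 208 could take the form
# of a feature vector. A pretrained torchvision ResNet-50 with its classifier
# removed is an illustrative assumption; the disclosure does not name a model.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # drop the classification head
backbone.eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_feature_vector(image_path: str) -> torch.Tensor:
    """Return a 2048-dimensional feature vector that is indicative of the
    visual characteristics of the object depicted in the image."""
    image = Image.open(image_path).convert("RGB")
    batch = preprocess(image).unsqueeze(0)   # shape (1, 3, 224, 224)
    with torch.no_grad():
        features = backbone(batch)           # shape (1, 2048)
    return features.squeeze(0)
```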

A variety of different types of object metadata 212 may be obtained from a variety of types of service provider systems 104. In one example, the service provider system 104 provides object metadata 212 relating to purchase or sale of the object, e.g., product name, product description, price for purchase or sale (e.g., based on online auctions), and so forth. In another example, the service provider system 104 provides object metadata 212 relating to customer reviews of the product, e.g., a number of “stars” or other rating, textual reviews, and so forth. In a further example, the object metadata 212 describes replacement parts of the object, e.g., filters, batteries, bulbs, and so forth. The object metadata 212 in this instance may be used to then order these replacement parts in an efficient and intuitive manner, e.g., through selection of AR digital content formed from the metadata.

The object metadata 212 in this example is then provided to an augmented reality (AR) configuration module 214. The AR configuration module 214, for instance, may be configured to generate AR digital content 126 from the object metadata 212 for display proximal to the object by an AR rendering module 216 to an output device 218, e.g., display device, audio output device, tactile output device, and so forth. The AR content in this example may include both content supported along with a direct view of a physical environment and content supported along with a recreated view of the physical environment. In this way, through use of the camera platform 118 as implemented by the camera platform manager module 116, a user may simply “look around” using a live feed of digital images 114, select objects in the digital images 114, and obtain metadata related to the objects.

In the replacement part example, the object recognition module 202 may be used to first identify an object. The object recognition data 208 produced based on this recognition may then be used as a “look up” to locate replacement parts associated with the recognized object, e.g., filters, bulbs, batteries, and so forth. AR digital content may then be output that is selectable to purchase these items in a direct view in the user interface. In an example, this information is correlated with a past purchase history, such that the AR digital content may indicate “when” to replace the replacement part, when the replacement part was last purchased, when it is due to be replaced, and so forth.

Having considered an example environment and system, consider now a discussion of some example details of the techniques for computer vision and image characteristic search in accordance with one or more implementations.

Computer Vision and Image Characteristic Search

In some aspects, computer vision and image characteristic search is leveraged in connection with active image search, which is discussed in relation to FIGS. 3A-5. Aspects of computer vision and image characteristic search also include leveraging listings with patterns, textures, and materials, which is discussed in relation to FIGS. 6-9. In still further aspects of computer vision and image characteristic search, it is used in connection with pattern-based authentication, which is discussed in relation to FIGS. 10-12.

Active Image Search

FIG. 3A depicts an example implementation 300 of user interaction with the camera platform 118 as implemented by the camera platform manager module 116 to define and refine an image search based on inferred image characteristics. This implementation 300 is illustrated using first, second, and third stages 302, 304, 306. FIG. 3B depicts another example implementation 350 of user interaction with the camera platform 118 as implemented by the camera platform manager module 116 to define and refine an image search based on inferred image characteristics. This implementation 350 is also illustrated using first, second, and third stages 352, 354, 356. FIG. 4 depicts a procedure 400 in an example implementation in which multiple digital images are used to infer characteristics to be used as a basis to perform a search for digital content. FIGS. 5A-5B depict examples and implementations of image search.

The following discussion describes techniques that may be implemented utilizing the previously described systems and devices. Aspects of the procedure as shown stepwise may be implemented in hardware, firmware, software, or a combination thereof. The procedure is shown as a set of blocks that specify operations performed by one or more devices and are not necessarily limited to the orders shown for performing the operations by the respective blocks. In portions of the following discussion, reference will be made to FIGS. 3A-5.

At the first stage 302 of FIG. 3A, for instance, a user interface 308 is output by the output device 218, e.g., a touchscreen display device of the computing device 102. The user interface 308 includes digital images that are usable as a basis to initiate definition of a search query. For example, a user may utter “find shoes” and the camera platform manager module 116 outputs preconfigured digital images 310, 312 that are usable to further refine the search, such as running shoes or dress shoes. Thus, in this example the preconfigured digital images selected from a repository are used to refine the intent of the user who initiates the search.

At the second stage 304, the user interface 308 is configured as a “live feed” of digital images 114 obtained in real time from the digital camera 112 in this example. The live feed includes a digital image 314 of a British flag that is selected by a user. In the illustrated example 300, the user input is detected as a tap of a finger of the user's hand 316 that is detected using touchscreen functionality of the output device 218. In this way, a user may distinguish between multiple objects displayed concurrently in the user interface 308 as well as indicate particular parts of the object of interest, e.g., a pattern in this instance. Other examples are also contemplated, such as a spoken utterance or other gestures.

In response to the user selection of the second stage 304, the digital image 114 displayed in the user interface 308 is captured (e.g., obtained from a frame buffer) along with the indication of the location of the particular object selected, e.g., as guided by X/Y coordinates of the “tap.” The digital image 114 is then processed by the object recognition module 202 as described above to identify the object (e.g., the pattern of the British flag in the illustrated example) and generate the object recognition data 208.

The object recognition data 208 is then communicated to a service provider system 104 in this example that is configured to support purchase and sale of goods. Accordingly, the service manager module 122 in this example searches a storage device 124 for object metadata 212 that pertains to the identified object. The object metadata 212, for instance, may include digital content that includes an offer to purchase a good or service having the characteristics inferred from the digital images 310, 314.

As shown at the third stage 306, an example of digital content 318 includes a digital image of a running shoe based on the digital image 310 and having a pattern from the digital image 314. The digital content also includes a name and price 320 (e.g., average price, price for sale, price to buy, etc.) of the object, which is displayed proximal to the object, e.g., the Union Jack running shoe. In this way, the camera platform manager module 116 implements the camera platform 118.

User interaction and capture of the digital images may also be used to infer which characteristics of the digital images are to be used as part of a search, i.e., to infer a user's intent. As shown at the first stage 352 of FIG. 3B, for instance, a digital image 358 is captured of a dress having a pattern. In this example, the digital image 358 includes an entire outline of the dress. Thus, the camera platform manager module 116, through machine learning, may detect that the overall shape of the dress is of interest to a user.

At the second stage 354, on the other hand, a digital image 360 is captured as a “close up” of a pattern. From this, the camera platform manager module 116 may determine, using machine learning (e.g., object recognition), that the pattern, texture, and/or materials are of interest in this digital image 360. As a result, the overall shape from the digital image 358 and the texture, materials, and/or pattern of the digital image 360 are used to locate digital content 362 (e.g., another digital image in a product listing) of a dress having a similar shape from the digital image 358 and pattern from the digital image 360. In this way, digital images may be used to express user intent that otherwise would be difficult if not impossible using text.

FIG. 4 depicts a procedure 400 in an example implementation of computer vision and active image search. User interaction with a user interface that outputs a live feed of digital images is monitored (block 402). A user, for instance, may view a live feed of digital images taken of a physical environment of the user 108 and the computing device 102. In this way, a user may view objects of interest as well as characteristics of those objects. The camera platform manager module 116 may monitor the user 108's interaction with (e.g., viewing of) the live feed.

User selection is detected of at least one of the digital images (block 404) by the camera platform manager module 116. A user, for instance, may press a button, tap a screen, utter a command, make a gesture, and so on to select one of the digital images from the live feed. The camera platform manager module 116 detects such user selection.

A characteristic is inferred from the selected digital image through comparison with at least one other digital image of the live feed (block 406). As part of the user interaction, for instance, a user may “look around” a physical environment. As part of this, the user may then focus or “zoom in” or “zoom out” on a particular object, such as to view an overall shape of the object, a pattern, texture, or material of the object, and so on. By comparing the selected digital image with a previous or subsequent digital image as part of the live feed, the camera platform manager module 116 may determine what is of interest to the user in the selected digital image. Object recognition using machine learning may be used as part of this comparison by the camera platform manager module 116, such as to compare tags generated using object recognition to determine commonality of the tags (e.g., a pattern in both images) and/or a “new” tag, e.g., an overall shape caused by “zooming out.” Additionally or alternately, the camera platform manager module 116 may compare feature data (e.g., feature vectors) generated using the object recognition to determine commonality of the feature data (e.g., a pattern indicated by the feature data in both images) and/or “new” feature data, e.g., describing an overall shape captured by “zooming out.”
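
The comparison at block 406 can be pictured as checking which per-attribute feature vectors persist between the selected frame and a neighboring frame and which change. The following is a minimal sketch of that idea; the attribute names, the similarity threshold, and the use of cosine similarity are illustrative assumptions rather than details from the disclosure.

```python
# Hedged sketch: inferring what is of interest in the selected frame by
# comparing per-attribute feature vectors against those of a neighboring frame
# in the live feed. The attribute names, cosine similarity, and the 0.8
# threshold are illustrative assumptions, not details from the disclosure.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def infer_characteristics(selected: dict, previous: dict, threshold: float = 0.8):
    """Each argument maps an attribute name (e.g., 'shape', 'pattern') to the
    feature vector computed for that attribute in a frame. Attributes whose
    vectors stay similar across frames are treated as common; attributes that
    change sharply are treated as newly emphasized, e.g., by zooming."""
    common, new = [], []
    for attribute, vector in selected.items():
        if attribute in previous and cosine(vector, previous[attribute]) >= threshold:
            common.append(attribute)
        else:
            new.append(attribute)
    return common, new

# Example: the pattern embedding persists between frames while the shape
# embedding changes because the user zoomed in on the fabric.
rng = np.random.default_rng(0)
pattern_vec = rng.normal(size=64)
previous_frame = {"pattern": pattern_vec, "shape": rng.normal(size=64)}
selected_frame = {"pattern": pattern_vec + 0.01 * rng.normal(size=64),
                  "shape": rng.normal(size=64)}
print(infer_characteristics(selected_frame, previous_frame))  # (['pattern'], ['shape'])
```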

A search query is then generated based at least in part on the inferred characteristic (block 408). By way of example, the camera platform manager module 116 generates a search query, which may include the selected digital image itself, object recognition data generated from the selected digital image, and so on. A search is then performed, either locally by the computing device 102 or remotely by the service provider system 104. A search result is then output in the user interface based on the search query (block 410) that includes digital content located as part of the search, e.g., product listings, digital images, and so forth.

In active image search applications, users often have a mental picture of desired content to be returned via an image search. The ultimate goals of an image search are to convey the user's mental picture to the system and to overcome the gap between the lower-level image representation and the higher-level conceptual content. In the techniques described in the following discussion, the system refines image search results by prompting users to indicate which image from a short list of candidate images is more reflective of the desired content.

In connection with active image search, an image search system—included as part of or leveraged by the computing device 102 or the service provider system 104—uses a feedback mechanism to refine search results without using the relative attribute annotations that are used by conventional systems. Instead, the image search system learns an image embedding via training on relatively low-cost (e.g., in relation to relative attribute annotations) binary attribute labels already present in many image databases. Given an initial query, the image search system selects images to present to a user. At each iteration, the image search system provides functionality that enables the user to simply select the image which is the most visually similar to their target image.

As noted above, the image search system receives an initial query as input. At each iteration, the image search system searches an image repository using “sampler” strategies to obtain an initial set of candidates. The image search system performs “Candidate Refinement” on this set of images using informative, but computationally expensive, selection criteria. During the “user feedback” step, a user input is received to indicate whether the new refined candidates are more representative of the user's desired image. If the user selects to accept a new image, for instance, the selected image becomes the query received by the image search system for the next iteration. Unlike conventional techniques which use costly relative attribute annotations to learn an image representation, the techniques described herein leverage low-cost binary labels that already exist in many datasets.
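
The iteration just described (sample candidates, refine them, collect feedback, and repeat) can be summarized with the following sketch. The loop structure follows the text; the sampler, refinement, and feedback steps are passed in as callables because the disclosure leaves their concrete implementations to the sections that follow.

```python
# Minimal sketch of the active image search loop: sample candidates, refine
# them with a more expensive criterion, show them to the user, and treat the
# user's choice as the next query. The three callables are placeholders for
# the sampler, re-ranking, and user-feedback steps described in the text.
def active_image_search(initial_query, repository, sample_candidates,
                        refine_candidates, ask_user, max_iterations=10, k=5):
    query = initial_query
    feedback = []                               # F: (closer, farther) image pairs
    for _ in range(max_iterations):
        candidates = sample_candidates(query, repository, feedback, k)
        candidates = refine_candidates(candidates, query, feedback)
        selection = ask_user(candidates)        # None when results satisfy the user
        if selection is None:
            break
        feedback.append((selection, query))     # the selection is closer to the target
        query = selection                       # the selected image becomes the next query
    return query, feedback
```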

To learn a robust feature representation, the image search system uses a Conditional Similarity Network (CSN). Thus, the model 206 corresponds to a CSN in one or more implementations. In accordance with the described techniques, the service provider system 104 may include functionality to use a single network to learn an embedding for multiple attributes jointly by learning a masking function which selects features important to each concept. This provides multiple views of the images in an image repository, which is more computationally efficient than training separate embedding models for each concept. By training in this way, the system also factors the overall similarity between two images when training a representation. The resulting model 206 thus encourages samples to separate into homogeneous subgroups in each embedding space. Therefore, the image search system can traverse an attribute embedding, e.g., heel height, such that a transition from one subgroup to a different subgroup (e.g., a boot to a stiletto) in a single step would be unlikely (even if both the boot and the stiletto have the same-sized heel). By combining constraints with better exploitation of training data, the described image search system improves over conventional systems in measuring the similarity between two images with regard to a specific concept.

Another difference between the techniques leveraged by the described systems and conventional techniques in the continuing example is that models are trained by the described system with binary attribute labels which already exist in many datasets and are relatively cheap to obtain. In one or more aspects, the described image search system refines image search results using a simple feedback mechanism and without using the relative attribute annotations or attribute inputs required by many conventional techniques. In some aspects, the image search system trains a Deep Q-Network-based image selection criterion rather than only using hand-crafted strategies. Additionally, the CSN is configured in a way that encourages smooth transitions between different concepts as the image search system traverses the learned embedding space.

In the following discussion, active image search is first described so as to be incorporated into the live feed techniques discussed above. This discussion includes a description of sampling strategies, including how to select informative images using a Deep Q-Network. Modifications to the CSN are then discussed, which are used during training to learn a set of embeddings used by active image search models, such as the model 206.

For image search with active feedback, the objective is for the image search system to quickly locate a target image I_t in a database given a query q. While the initial query can take multiple forms (e.g., keywords, images, or sketches), it is provided as an image I_{q_0} which shares some desirable attribute with the target image. At each iteration, the image search system selects K images to obtain feedback on from a user.

Broadly speaking, active learning criteria focus on reducing uncertainty in a current model or exploiting the information obtained in order to make fine-grained distinctions. In practice, however, many search engines provide means to filter results based on metadata labels. For example, when searching for clothing, a search engine may allow a user to filter results based on its category (e.g., pants), subcategory (e.g., jeans), and color, among others. Coupled with the initial query, such filters provide a strong signal to initialize an active learning algorithm. Thus, the criteria that follow focus on exploitation of this existing knowledge.

As a baseline, the image search system performs an iterative nearest neighbors query to obtain candidate images. At each iteration, the image search system determines the K-nearest neighbors to the current query that have not been previously selected by the user and returns them. Each image selected by the user as the most relevant to their target image is used as the query in the next iteration.
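
A minimal sketch of this nearest-neighbors baseline follows, assuming images are represented as feature vectors and compared with Euclidean distance; neither assumption is fixed by the disclosure.

```python
# Hedged sketch of the nearest-neighbors baseline: return the K closest images
# to the current query that the user has not already been shown. Feature
# vectors and Euclidean distance are illustrative assumptions.
import numpy as np

def nearest_neighbor_candidates(query_vec: np.ndarray,
                                repository: np.ndarray,
                                already_shown: set,
                                k: int = 5) -> list:
    """repository has shape (num_images, feature_dim); returns the indices of
    the K nearest previously unshown images to the query vector."""
    distances = np.linalg.norm(repository - query_vec, axis=1)
    order = np.argsort(distances)
    candidates = [int(i) for i in order if int(i) not in already_shown]
    return candidates[:k]
```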

The image search system uses the model 206 to select samples which satisfy a maximum number of feedback constraints provided by the user. For each iteration in which the user causes a new candidate query I*_{q_{i+1}} (because, rather than indicating that the search results are satisfactory, the user selects one of the provided images to further refine the search), the previous query I_{q_i} is farther away from the target image than I*_{q_{i+1}}. In the following discussion, the term F represents the set of such feedback constraints made so far and the term O represents the set of previously unselected images in a database. Additionally, elements of F are tuples (I_x, I_y) where I_x is closer to the target image than I_y. Based on this, the image search system calculates the portion of constraints that a sample satisfies. By way of example, the image search system calculates this portion according to the following equation:

$S\left( I_o \mid l = 1, F \right) = \frac{1}{\left| F \right|} \sum_{\forall I_{x_i}, I_{y_i} \in F} \mathbb{1}_{fcs}\left( I_o, I_{x_i}, I_{y_i} \right),$

Here, the term $\mathbb{1}_{fcs}$ represents an indicator function that uses a distance function D and returns one if D(I_o, I_{x_i}) < D(I_o, I_{y_i}). In accordance with one or more implementations, a scenario where l=1 indicates that a sample satisfies the portion of constraints. Given this, criteria for a next proposed query can be represented in one or more examples as:

$I_{q_{i+1}}^{*} = \arg\max_{I_o \in \mathcal{O}} S\left( I_o \mid l = 1, F \right).$

The image search system is configured to break ties using nearest neighbors sampling between the candidates and the query image.
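
Putting the constraint-satisfaction sampler together, the sketch below scores each unselected image by the fraction of feedback constraints it satisfies and breaks ties by distance to the current query. The use of Euclidean distance over stored feature vectors is an illustrative assumption.

```python
# Hedged sketch of the constraint-satisfaction sampler: score each unselected
# image by the fraction of feedback constraints it satisfies, then take the
# highest-scoring image, breaking ties by distance to the current query.
# Euclidean distance over stored feature vectors is an illustrative assumption.
import numpy as np

def satisfied_fraction(candidate, constraints, features):
    """S(I_o | l=1, F): fraction of (x, y) pairs for which the candidate is
    closer to I_x than to I_y in the embedding."""
    if not constraints:
        return 0.0
    hits = sum(
        np.linalg.norm(features[candidate] - features[x]) <
        np.linalg.norm(features[candidate] - features[y])
        for x, y in constraints
    )
    return hits / len(constraints)

def next_query(unselected, constraints, features, query):
    scores = {o: satisfied_fraction(o, constraints, features) for o in unselected}
    best = max(scores.values())
    tied = [o for o, s in scores.items() if s == best]
    # Break ties with nearest-neighbors sampling against the current query.
    return min(tied, key=lambda o: np.linalg.norm(features[o] - features[query]))
```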

While sampling strategies can provide likely candidates based on a current model, these strategies do not take into account how much a sample informs the search results. Many conventional techniques that provide such information are computationally expensive, making it infeasible to run them over an entire database. As such, the described image search system identifies a short list of likely candidates C using the image sampling criteria, and then re-ranks them based on how informative these candidates are to the current model.

In expected error reduction, this refinement strategy leveraged by the image search system focuses on reducing the generalization error of the current model for the desired target image. As such, the search strategy deployed by the image search system balances exploration and exploitation criteria. In one or more implementations, the image search system measures the entropy of the current model by calculating the portion of constraints an image satisfies. By way of example, the image search system may calculate this entropy in accordance with the following:

$H(F) = -\sum_{I_o \in \mathcal{O}} \sum_{l} S\left( I_o \mid l, F \right) \log\left( S\left( I_o \mid l, F \right) \right).$

Here, note that S(I_o | l=0, F) is defined as 1 − S(I_o | l=1, F). Further, the term I_{t*} represents a current best guess, which is used as a proxy for the target image when predicting the user's response r. The image search system estimates a likelihood that a new constraint is satisfied by determining a likelihood that a candidate image shares the same attributes with the target image. The image search system obtains this likelihood by converting the distances in an attribute's embedding space to a probability. The image search system learns scaling parameters ϕ based on a training set. Given this, the image search system selects the candidate images according to the following:

$I_{q_{i+1}}^{*} = \arg\max_{I_c \in C} \sum_{r} \sigma\left( r \mid D\left( I_c, I_{t^*} \right), \phi \right) H\left( F \cup \left( I_c, r \right) \right).$
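
A sketch of this expected-error-reduction re-ranking follows; it reuses satisfied_fraction from the sampler sketch above. The logistic conversion of embedding distance to a response probability stands in for σ(r | D(I_c, I_{t*}), ϕ), and its parameters are illustrative assumptions.

```python
# Hedged sketch of the expected-error-reduction re-ranking; it reuses
# satisfied_fraction from the sampler sketch above. The logistic conversion of
# embedding distance to a response probability stands in for sigma, and its
# parameters phi are illustrative assumptions.
import numpy as np

def entropy(unselected, constraints, features):
    """H(F): entropy over whether each unselected image satisfies the constraints."""
    h = 0.0
    for o in unselected:
        p = satisfied_fraction(o, constraints, features)   # S(I_o | l=1, F)
        for q in (p, 1.0 - p):                             # l = 1 and l = 0
            if q > 0.0:
                h -= q * np.log(q)
    return h

def response_probability(candidate, best_guess, features, phi=(1.0, 0.0)):
    """Probability that the candidate shares the target's attribute, from a
    logistic function of its distance to the current best guess I_t*."""
    a, b = phi
    d = np.linalg.norm(features[candidate] - features[best_guess])
    return 1.0 / (1.0 + np.exp(a * d + b))

def rerank(candidates, unselected, constraints, features, best_guess, query):
    """Pick the candidate that maximizes the response-weighted entropy of the
    updated constraint set F U (I_c, r)."""
    def expected_information(c):
        p_yes = response_probability(c, best_guess, features)
        return (p_yes * entropy(unselected, constraints + [(c, query)], features)
                + (1.0 - p_yes) * entropy(unselected, constraints + [(query, c)], features))
    return max(candidates, key=expected_information)
```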

In systems that use learned re-ranking criteria, a learned criterion adapts to the exact task and dataset. To this end, the service provider system 104 trains a Deep Q-Network (DQN) with experience replay to learn how to select informative images as the candidate images. In this paradigm, the system learns a function Q that estimates the reward ρ of taking some action given the current state of the system Ψ. In accordance with the described techniques, the value ρ is defined as a change in the percentile rank of the target image under the current model after obtaining feedback from the user. Further, the current state of the system Ψ may be determined as a concatenation of the differences between the embedding representation of the query image and those of all of the candidate images being re-ranked. FIG. 5A depicts an example 500 of an implementation of a structure of a DQN model. In one or more implementations, the model 206 is implemented based on this structure. In any case, in operation, the image search system uses the selection criteria to maximize an expected reward if image I_c is selected to present to the user:

$I_{q_{i+1}}^{*} = \arg\max_{I_c \in C} Q\left( I_c, \Psi \right).$

In accordance with the described techniques, this model is trained using a Huber loss on top of a temporal difference error between the expected and observed rewards. With reference to the illustrated example 500, the image search system uses the function ψ(I_c, I_q) to return the difference between the feature representation of each candidate image and that of the query image. Further, the output dimension of FC3 is |C|, where each output represents a predicted reward of selecting its corresponding candidate.
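
The following sketch illustrates one way a DQN-style scorer of this form could be laid out: the state is the concatenated feature differences ψ(I_c, I_q), and the final layer emits one predicted reward per candidate, of which the argmax is shown to the user. The hidden sizes and layer count are assumptions; only the input and output shapes follow the description above.

```python
# Hedged sketch of a DQN-style candidate scorer: the state is the concatenated
# feature differences psi(I_c, I_q), and the final layer emits one predicted
# reward per candidate. Hidden sizes and layer count are assumptions; only the
# input/output shapes follow the FC3 description above.
import torch
import torch.nn as nn

class CandidateScorer(nn.Module):
    def __init__(self, feature_dim: int, num_candidates: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim * num_candidates, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_candidates),   # FC3: one predicted reward per candidate
        )

    def forward(self, query_feat: torch.Tensor, candidate_feats: torch.Tensor) -> torch.Tensor:
        # psi(I_c, I_q): difference between each candidate's features and the query's.
        state = (candidate_feats - query_feat.unsqueeze(0)).flatten()
        return self.net(state)

def select_candidate(scorer: CandidateScorer,
                     query_feat: torch.Tensor,
                     candidate_feats: torch.Tensor) -> int:
    """Return the index of the candidate with the highest Q(I_c, Psi)."""
    with torch.no_grad():
        q_values = scorer(query_feat, candidate_feats)
    return int(torch.argmax(q_values))

# Training would minimize a Huber loss (torch.nn.SmoothL1Loss) on the temporal
# difference between predicted and observed rewards, with experience replay.
```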

Broadly speaking, the image search system trains a set of embeddings to compare two images, where each embedding represents a different attribute to be captured. In implementations where the model 206 is a CSN model, the CSN model is designed to learn a disentangled embedding for different attributes in a single model. In this way, a general image representation is learned through the image encoding layers of the model. The image search system then applies a trained mask to the representation to isolate the features important to that specific attribute. This enables each embedding to share some common parameters across concepts, while the mask is tasked with transforming the features into a discriminative representation. After obtaining the general embedding features G_i, G_j of two images, the image search system compares them. By way of example, the image search system compares them using a masked distance function, such as:

$D_m\left( G_i, G_j; m_a \right) = \left\| G_i * m_a - G_j * m_a \right\|_2,$

Here, the term m_a is a mask for some attribute and the operator * denotes an element-wise multiplication. In one or more implementations, the service provider system 104 trains the CSN model using a triplet loss function such as:

$L_T\left( G_x, G_y, G_z; m_a \right) = \max\left\{ 0,\ D_m\left( G_x, G_y; m_a \right) - D_m\left( G_x, G_z; m_a \right) + h \right\}.$

The service provider system 104 also configures the embedded features G to be L2 regularized to encourage regularity in the latent space. In addition, L1 regularization is performed on the masks m to encourage a sparse feature selection. Based on this, the resulting total loss function with which the model is trained is:

$L_{CSN}\left( G_x, G_y, G_z; m_a \right) = L_T\left( G_x, G_y, G_z; m_a \right) + \lambda_1 \left\| G \right\|_2^2 + \lambda_2 \left\| m_a \right\|_1$
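
The three formulas above can be read together as the following sketch, in which D_m is the masked distance, L_T the triplet loss, and L_CSN the total training loss. The margin h and the regularization weights λ₁ and λ₂ are illustrative values, not ones specified by the disclosure.

```python
# Hedged sketch of the loss terms reconstructed above: the masked distance D_m,
# the triplet loss L_T with margin h, and the total loss with L2 regularization
# on the embeddings and L1 regularization on the mask. The values of h,
# lambda_1, and lambda_2 are illustrative assumptions.
import numpy as np

def masked_distance(g_i, g_j, mask):
    """D_m(G_i, G_j; m_a) = || G_i * m_a - G_j * m_a ||_2"""
    return np.linalg.norm(g_i * mask - g_j * mask)

def triplet_loss(g_x, g_y, g_z, mask, h=0.3):
    """L_T: the anchor G_x should be closer to the positive G_y than to the
    negative G_z by at least the margin h under the masked distance."""
    return max(0.0, masked_distance(g_x, g_y, mask) - masked_distance(g_x, g_z, mask) + h)

def csn_loss(g_x, g_y, g_z, mask, h=0.3, lambda1=5e-4, lambda2=5e-4):
    """L_CSN = L_T + lambda_1 * ||G||_2^2 + lambda_2 * ||m_a||_1, where the
    L2 term here sums over the three embedded images of the triplet."""
    embedding_l2 = sum(np.sum(g ** 2) for g in (g_x, g_y, g_z))
    mask_l1 = np.sum(np.abs(mask))
    return triplet_loss(g_x, g_y, g_z, mask, h) + lambda1 * embedding_l2 + lambda2 * mask_l1
```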

Since the goal is to traverse the model's embeddings in order to locate a target image, it is desirable that the embeddings provide natural transitions from image to image. For example, transitioning from an anchor image to the rightmost image in the example 510 of FIG. 5B would be considered a significant divergence. The center image, while still different, is a less divergent transition even though all three images of the example 510 belong to the boot category. Therefore, to make embedding spaces intuitive overall, the described system accounts for similarity between two images beyond an attribute being encoded. Given a set of attributes represented by A_x, A_y, A_z for each of the images in a training triplet, the difference in shared attributes is computed between the negative and positive pairs. By way of example, the system computes the difference in shared attributes according to the following:

$w\left( A_x, A_y, A_z \right) = \max\left\{ 0,\ \frac{1}{\varepsilon}\left( \left| A_x \cap A_y \right| - \left| A_x \cap A_z \right| \right) \right\}$

Here, the term ε represents the number of embeddings being trained. The system prevents negative values of w to maintain a minimum margin between the negative and positive pairs of the triplet. In one or more implementations, the system determines a new margin, which may be defined as follows:

$h'\left( A_x, A_y, A_z \right) = h + \eta\, w\left( A_x, A_y, A_z \right)$

Here, the term η is a scalar parameter. It is to be appreciated that visual searches performed with models trained in manners different from those described just above may be leveraged without departing from the spirit or scope of the described techniques. As noted in the above discussion, however, image searches that are based on visual characteristics can be used in scenarios where it may be difficult for users to accurately convey, in words, a desired target image or item. In this context, consider the following discussion of listings with patterns, textures, and materials.
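
Before turning to that discussion, the adaptive margin just described can be summarized in a small sketch: w counts how many more attributes the anchor shares with the positive than with the negative, scaled by the number of embeddings ε, and h' widens the base margin h by η·w. The attribute sets and the values of h and η below are assumptions for illustration.

```python
# Hedged sketch of the adaptive triplet margin. The attribute sets and the
# values of h and eta are illustrative assumptions.
def shared_attribute_weight(a_x: set, a_y: set, a_z: set, num_embeddings: int) -> float:
    """w(A_x, A_y, A_z) = max{0, (|A_x ∩ A_y| - |A_x ∩ A_z|) / epsilon}"""
    return max(0.0, (len(a_x & a_y) - len(a_x & a_z)) / num_embeddings)

def adaptive_margin(a_x: set, a_y: set, a_z: set, num_embeddings: int,
                    h: float = 0.3, eta: float = 0.1) -> float:
    """h'(A_x, A_y, A_z) = h + eta * w(A_x, A_y, A_z)"""
    return h + eta * shared_attribute_weight(a_x, a_y, a_z, num_embeddings)

# Example: the anchor shares three attributes with the positive and one with
# the negative out of four trained embeddings, so the margin grows by eta * 0.5.
print(adaptive_margin({"boot", "leather", "ankle", "flat"},
                      {"boot", "leather", "ankle", "heeled"},
                      {"boot", "suede", "knee", "heeled"},
                      num_embeddings=4))  # 0.35
```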

Listings with Patterns, Textures, and Materials

An increasing number of service provider systems and associated applications surface listings to client devices for various products due to advances in computing technologies. Conventional techniques for generating such listings typically involve associating text (e.g., text tags) with a listing that is descriptive of the product corresponding to the listing. When listing a shirt to be surfaced via a platform (e.g., for sale via an e-commerce platform), for instance, conventional systems may allow a client device user to enter or select textual descriptions of the shirt, such as to describe the shirt's color, pattern, texture, or material. Such conventional systems also employ text-based search techniques to identify items. In other words, these conventional techniques consider the text associated with listed items, and surface items that are associated with text that matches the searched-for text. However, attributes of many items (e.g., patterns, textures, materials, and so on) can be difficult to describe using text. It may be difficult, for instance, for a client device user who is providing input to a system to list an item, or for a client device user who is providing input to the system to search through listed items, to describe patterns, such as particular plaid patterns having varying numbers and sizes of vertical and horizontal bars.

To overcome these problems, the described system leverages computer vision and image characteristic search. In contrast to conventional techniques, this system does not rely on a textual description of an item, e.g., one that is associated with the item by the system or a client device user. Instead, the system leverages one or more images (or videos) of an item being listed, and determines visual characteristics of the item, such as patterns, textures, materials, and so on, automatically from the images and/or videos. As part of this, the system performs one or more image processing techniques on visual digital content (e.g., images or videos) provided to the system in connection with listing an item. Based on this image processing, the system generates visual data that describes the characteristics, e.g., one or more image feature vectors that are capable of describing a pattern, a texture, and/or a material of the item depicted in the image.

Additionally, the system does not rely on text as a basis for searching through listed items. Instead, the system leverages image (or video) queries to perform a visual search of listed items, such as by comparing feature vectors describing a pattern of a query image to the feature vectors describing patterns of listed items. To obtain a query image, the system is configured to surface a user interface, to searching users, that enables a searching user to upload an image (or video) as a search query. The system is also configured to present multiple user-interface instrumentalities depicting different patterns and that are selectable as a search query to initiate a search. Additionally or alternately, the system is configured to present a user interface that allows a user to provide a pattern, such as by providing user inputs to draw a pattern (e.g., with a stylus or touch input device) via the user interface.

The system may use an entirety or a portion of this uploaded, selected, or user-drawn image as the search query image. Regardless of whether the query image is uploaded, selected, or drawn via the interface by a searching user, the system may perform the one or more image processing techniques on the query image. For instance, the system performs a search based on an uploaded image, a user-selected pattern, or a user-provided drawing. In so doing, the system generates non-textual data indicative of the characteristics of the query image, e.g., one or more image feature vectors that are capable of describing the pattern, texture, and/or material depicted in at least a portion of the query image. Given such user input and the data indicative of the characteristics, the system identifies and presents search results, for instance, that are based on the uploaded image, the user-selected pattern, or the user-provided drawing.

FIG. 6 depicts another example environment 600 that is operable to employ aspects of listings with patterns, textures, and materials. It is to be appreciated that the components of the illustrated environment 600 may correspond to the components and systems discussed in relation to the other figures described herein without departing from the spirit or scope of the techniques.

The illustrated example 600 includes the computing device 102, another computing device 602, and the service provider system 104, which are communicatively coupled via the network 106. The computing device 102 and the other computing device 602 are each illustrated with a communication module 610, 612, which represents functionality to enable this communication. In the illustrated example 600, the computing device 102 is depicted providing listing data 614 having visual listing data 616 to the service provider system 104. In this example, the computing device 102 may be associated with a client device user that is listing an item via the service provider system 104, e.g., listing the item for sale via the service provider system. Further, the visual listing data 616 may correspond to one or more images or videos of the item listed via the listing data.

In this example, the service provider system 104 is illustrated with a listing system 618 having a computer vision module 620 and a pattern recognition module 622. The computer vision module 620 represents functionality of the listing system 618 to process the visual listing data 616 (images and/or videos) of the received listing data 614, e.g., to generate different data (feature vectors) to describe visual characteristics of the listed item. The computer vision module 620 also represents functionality to perform a variety of other computer vision techniques with respect to visual content of items listed via a listing service and also to perform visual searches for items listed via the service. After processing the received visual listing data 616, the listing system 618 may cause this visual information to be stored as part of the listing data 624 at the service provider system 104. This stored visual-specific information is illustrated in storage 124 as visual characteristic listing data 628. The listing data 624 is shown with ellipses to indicate that there may be a variety of the visual characteristic listing data 628 for a particular item being listed and also that the listing data 624 may include the visual characteristic listing data 628 for multiple different items. Along these lines, the pattern recognition module 622 may represent functionality of the listing system 618 to detect patterns, textures, and/or materials of items that are depicted in visual content. The pattern recognition module 622 also represents functionality to generate information indicative of detected patterns, textures, and/or materials, e.g., image feature vectors.

The other computing device 602 is depicted communicating query data 630 that includes visual query data 632 to the service provider system 104. The visual query data 632 may correspond to one or more images and/or videos selected for upload by a client device user of the other computing device 602, one or more images and/or videos selected by the client device user via a user interface of the service provider system 104, or one or more images generated by the other computing device 602 based on user-provided input received to draw a pattern. Broadly speaking, the client device user of the other computing device 602 may have provided the query data 630 to search the listings of the service provider system 104, e.g., to search the listings to purchase an item listed. In any case, the computer vision module 620 may leverage the visual query data 632 to perform a visual search of the visual characteristic listing data 628 to identify listings that match the search query, such as listed items having patterns, textures, and/or materials that are visually similar or the same as patterns, textures, and/or materials depicted in the visual query data 632. The listing system 618 can then generate query response data 634 for communication back to the other computing device 602. In general, this query response data 634 is indicative of the identified listings. The query response data 634 may correspond to a list of the listings (or a subset of them) that are a match with the visual query data 632. The query response data 634 enables the other computing device 602 to present digital content of corresponding items for purchase via a user interface, e.g., a listing of the items including images of them. In the context of user interfaces to search for listed items, consider FIG. 7, which depicts an example of a user interface 700.

The example user interface 700 includes multiple user interface instrumentalities 702 that have images of different patterns. These instrumentalities are selectable to generate a search query for listings that are visually similar or the same as the selected pattern. The user interface 700 also includes an instrumentality 704 that enables a user to upload an image of a pattern to be used as a basis for a search and another instrumentality 706 that enables a user to upload a video of a pattern to be used as a basis for a search. The user interface 700 can be presented to a searching user to enable visual searches to be performed of the visual characteristic listing data 628. In operation, a client device user may select one of these instrumentalities with the pattern images, which can then serve as a query image to perform the visual search.

FIG. 8 illustrates an example scenario 800 in which an image is captured of an item that is to be listed. In this scenario, the image is captured using a mobile device (which corresponds to the computing device 102). The captured image then serves as the visual listing data 616, which may be processed by the listing system 618 to describe visual characteristics of depicted items.

FIG. 9 depicts a procedure 900 in an example implementation in which a user interface having a plurality of selectable images of patterns is used to conduct a visual search for images.

A plurality of images each depicting a different pattern is presented via a user interface (block 902). By way of example, the other computing device 602 displays the user interface 700, which includes the multiple user interface instrumentalities 702 that have images of different patterns and are selectable to generate a search query for listings that are visually similar or the same as a selected pattern. A selection of one of the images is received (block 904). By way of example, the other computing device 602 receives a selection of one of the multiple user interface instrumentalities 702, such as a touch selection, a voice-based selection, a stylus selection, a mouse selection, and so forth.

A search query including the selected image is transmitted to a listing service (block 906). In accordance with the principles discussed herein, the listing service is configured to generate data describing a respective pattern of the selected image and identify listed items having a similar pattern. By way of example, the other computing device 602 configures the image selected at block 904 as the visual query data 632 and packages it as part of the query data 630. The communication module 612 then communicates the query data 630 over the network to the service provider system 104. In this scenario, the service provider system 104 leverages the functionality of the listing system 618 (e.g., the computer vision module 620 and the pattern recognition module 622) to generate data describing a respective pattern of the visual query data 632 and identify listings in the listing data 624 having a similar pattern, e.g., through a comparison with the visual characteristic listing data 628.
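
A hypothetical client-side sketch of blocks 904 and 906 follows, showing how a selected pattern image might be packaged and transmitted as a search query; the endpoint URL and field names are invented for illustration and do not correspond to any actual service interface.

```python
# Hypothetical client-side sketch: packaging a selected pattern image as
# the visual query data and transmitting it to the listing service.
import requests

def submit_visual_search(image_path: str,
                         service_url: str = "https://listing-service.example.com/visual-search"):
    with open(image_path, "rb") as image_file:
        response = requests.post(
            service_url,
            files={"visual_query": image_file},  # assumed field name
            data={"query_type": "pattern"},
            timeout=30,
        )
    response.raise_for_status()
    # The response is expected to describe the matching listings
    # (query response data), e.g., listing identifiers and image URLs.
    return response.json()
```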

Search results that include at least one identified item having a similar pattern are received (block 908). By way of example, the other computing device 602 receives the query response data 634, which includes at least one item having a similar pattern as identified by the listing system 618. Digital content depicting at least one of the identified items is presented via the user interface (block 910). By way of example, the user interface 700 presents images 708, which in this example represent items identified by the listing system 618.

In one or more implementations, the service provider system 104 also generates analytics based on the visual query data 632, such as analytics indicative of the patterns, textures, and materials for which users search. The service provider system 104 can then provide this information to entities that list and/or produce products. Consider an example in which client device users perform a multitude of searches for a similar plaid pattern during the winter. The service provider system 104 may generate analytics indicating that users are searching for this plaid and that only a few search results are returned to the client device users because very few available products having this or the same pattern are listed. A listing client device user may utilize this information to list more products having this pattern. An entity that produces products may utilize this information to produce more products having this pattern.
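
The analytics just described could, for example, be aggregated as a simple tally of pattern searches versus matching listings. The sketch below is illustrative only; the pattern labels, threshold, and report fields are assumptions rather than details of the described system.

```python
# Sketch of the kind of analytics described above: tallying how often a
# pattern is searched for versus how many listings actually match it.
from collections import Counter

def pattern_demand_report(search_log: list[str],
                          matches_per_search: dict[str, int],
                          undersupply_threshold: int = 5) -> list[dict]:
    """Flag patterns that are searched often but return few listings."""
    search_counts = Counter(search_log)          # e.g., {"plaid": 1200, ...}
    report = []
    for pattern, searches in search_counts.most_common():
        available = matches_per_search.get(pattern, 0)
        report.append({
            "pattern": pattern,
            "searches": searches,
            "available_listings": available,
            "undersupplied": available < undersupply_threshold,
        })
    return report

# Example: many plaid searches in winter but few plaid listings would be
# surfaced to sellers and producers as an undersupplied pattern.
```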

The described system also has the advantage of describing patterns, textures, and materials using data that results from visual processing techniques rather than relying on human-understandable text descriptions. This can result in more accurate descriptions of the patterns, textures, and materials of an item being listed than the human-understandable text descriptions. This also enables the system to easily identify listed items having characteristics that visually match, or are visually similar to, queried-for patterns, textures, and materials. Consider now the following discussion of using computer vision and image characteristic search for pattern-based authentication.

Pattern-Based Authentication

Conventional systems for enabling client device users to list products and services for surfacing to other client device users generally provide the listing users control over how listed products and services are described in corresponding listings. Typically, the only mechanisms of these conventionally-configured systems to ensure that listing users are listing what they say they are listing are reviews of other client device users that have followed through with the listing, e.g., by purchasing, renting, and so on, the listed product or service. At that point, however, the client device users have already committed some amount of resources (e.g., time, money, and so forth) to following through with the listed item. These systems do not prevent at least a few users from following through with listed products or services that fail to meet the provided description. One example of this scenario is listing users listing counterfeit products (e.g., handbags, sunglasses, watches, and so forth) as being authentic. Client device users that do not trust surfaced descriptions of listed products and services may simply not use a platform that lists products or services with untrustworthy descriptions.

To overcome these problems, computer vision and image characteristic search is used for pattern-based authentication in a digital medium environment. The pattern-based authentication system obtains visual content (e.g., one or more images or videos) of a product or service that is to be listed and confirms or denies a designation of authenticity. To confirm or deny an authentic designation, the pattern-based authentication system compares determined visual characteristics of the product or service depicted in the obtained visual content to characteristics in visual content of a product or service known to be authentic. The pattern-based authentication system may use image or video processing techniques along with visual pattern matching to determine whether a captured pattern matches a known authentic pattern.

By way of example, authentic handbags of a particular brand may have stitching that is indicative of authenticity. The pattern-based authentication system may thus require listing client device users that are listing such a handbag as authentic to also provide an image or video of the stitching. The pattern-based authentication system can then use image processing techniques to compare the stitching pattern of the provided image to a known authentic stitching pattern depicted in an image. If the pattern-based authentication system determines that the stitching pattern of the provided image matches the known authentic stitching pattern, the pattern-based authentication system allows a listing user to list the handbag as authentic. In contrast, for watches the pattern-based authentication system may require listing users to upload video showing how a watch's second hand rotates around the face. The pattern-based authentication system may determine authenticity based on a comparison of the movement in the provided video to video of a second hand of a known authentic watch.
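
One minimal way to realize this pattern-matching check is to compare a feature vector of the submitted stitching image against feature vectors of known authentic stitching and accept when the best similarity clears a threshold. The sketch below assumes vectors produced as in the earlier extraction example; the threshold value and function name are assumptions, not values from the described system.

```python
# Minimal sketch of the stitching-pattern check using cosine similarity
# over L2-normalized feature vectors. The 0.9 threshold is an assumption.
import numpy as np

def matches_authentic_pattern(candidate_vector: np.ndarray,
                              authentic_vectors: list[np.ndarray],
                              threshold: float = 0.9) -> bool:
    """Return True if the submitted pattern is close enough to any
    known authentic pattern to support an 'authentic' designation."""
    if not authentic_vectors:
        return False
    best_similarity = max(
        float(np.dot(candidate_vector, reference))
        for reference in authentic_vectors
    )
    return best_similarity >= threshold
```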

FIG. 10 depicts another example environment 1000 that is operable to employ aspects of pattern-based authentication. For instance, the pattern-based authentication system may be implemented as part of the service provider system 104 or accessible to the service provider system 104 to provide the described functionality. It is to be appreciated that the components of the illustrated environment 1000 may correspond to the components and systems discussed in relation to the other figures described herein without departing from the spirit or scope of the techniques.

The illustrated environment 1000 includes the computing device 102 and the service provider system 104, which may be configured as described above. In the illustrated example 1000, the listing data 614 includes authentic designation data 1002 and visual authenticity data 1004. In accordance with the described techniques, the authentic designation data 1002 corresponds to a user selection indicating that the listed product or service is “authentic,” e.g., the listed item is an authentic branded handbag or an authentic branded watch. Based on such a selection, the computing device 102 may prompt the user to also provide visual content for confirming authenticity of the product or service being listed. Absent confirmation of authenticity from visual content, the pattern-based authentication system may not allow the listing to include an authentic designation.

In any case, the visual authenticity data 1004 represents one or more images or videos provided by the listing user for confirming the authenticity of the listed product or service. The listing system 618 may employ the computer vision module 620 and the pattern recognition module 622 to determine from the visual authenticity data 1004 whether the product being listed is authentic, e.g., by comparing one or more patterns captured in the visual authenticity data 1004 to known authentic patterns. The pattern-based authentication system allows the listing user to list the product or service as authentic or not depending on the determination. The illustrated visual authentication listing data 1006 indicates whether products or services being listed correspond to authentic products and services or not. This data can serve as the basis for allowing the listing user to list a product or service as authentic or not. In accordance with the described techniques, consider FIG. 11.

FIG. 11 depicts an example scenario 1100 in which a client device user uses the computing device 102 (depicted in this example as a mobile device) to capture visual content of an item the user has selected to list as authentic. In the illustrated scenario 1100, the product that the user is listing is the handbag 1102. Prior to the illustrated portion of the scenario 1100, the pattern-based authentication system may have prompted the client device user to capture an image or video of a portion of the handbag 1102 near its closure mechanism and including stitching of the handbag. In this example 1100, the computing device 102 is depicted as having captured or capturing visual content corresponding to this portion, a preview of which is shown displayed on the display screen 1104 of the computing device 102. The computing device 102 packages the captured visual content with the listing data 614 as the visual authenticity data 1004. The service provider system 104 can then use this data to determine authenticity of the handbag 1102. It is to be appreciated that authenticity of a variety of products and services may be determined in accordance with the described techniques.

FIG. 12 depicts a procedure 1200 in an example implementation in which an item to be listed with a listing service is determined authentic or not based on known visual characteristics of authenticity.

A selection is received via a user interface indicating to list an item on a listing service with an authentic designation (block 1202). By way of example, the computing device 102 receives a selection made via a user interface to list the handbag 1102 on a listing service associated with the service provider system 104 and with an authentic designation. As noted above, the selection to list an item as authentic may be described by the authentic designation data 1002. Accordingly, the service provider system 104 receives data describing that a user has selected to list the handbag 1102 with an authentic designation.

Digital visual content depicting visual characteristics of the item being listed is received (block 1204). By way of example, a user deploys functionality of the computing device 102 to capture an image of the handbag 1102, such as the image displayed via the display screen 1104 in FIG. 11. The computing device 102 packages this captured image as the visual authenticity data 1004. The service provider system 104 thus receives the image of the handbag 1102 as part of the listing data 614.

The visual characteristics are compared to known visual characteristics of authentic items (block 1206). By way of example, the listing system 618 leverages functionality to compare the visual characteristics depicted in the image of the handbag 1102 to known visual characteristics of authentic handbags, as described by the visual authentication listing data 1006. A determination is made as to whether the item being listed is authentic or not based on the comparing (block 1208). By way of example, the listing system 618 makes a determination as to whether the handbag 1102 is authentic or not based on the comparing of block 1206.
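
As a hedged sketch of blocks 1206 through 1212, the outcome of the visual comparison could gate whether the authentic designation is permitted, as below. The function and field names are illustrative assumptions, and the sketch reuses the matches_authentic_pattern example shown earlier.

```python
# Illustrative sketch: gating the authentic designation on the result of
# the visual comparison (see matches_authentic_pattern above).

def evaluate_authentic_designation(candidate_vector,
                                   authentic_vectors) -> dict:
    """Decide whether a listing may carry the authentic designation."""
    is_authentic = matches_authentic_pattern(candidate_vector, authentic_vectors)
    return {
        "allow_authentic_designation": is_authentic,
        "message": (
            "Item may be listed with an authentic designation."
            if is_authentic
            else "Item may not be listed with an authentic designation."
        ),
    }
```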

An indication that the item is allowed to be listed with an authentic designation is surfaced responsive to a determination that the item is authentic (block 1210). By way of example, the listing system 618 determines at block 1208 that the handbag 1102 is authentic based on the comparing of block 1206. Responsive to this, the service provider system 104 surfaces an indication (e.g., by communicating it to the computing device 102) that the handbag 1102 is allowed to be listed with an authentic designation.

An indication that the item is not allowed to be listed with an authentic designation is surfaced responsive to a determination that the item is not authentic (block 1212). By way of example, the listing system 618 determines at block 1208 that the handbag 1102 is not authentic based on the comparing of block 1206. Responsive to this, the service provider system 104 surfaces an indication (e.g., by communicating it to the computing device 102) that the handbag 1102 is not allowed to be listed with the authentic designation.

Having described example techniques and procedures in accordance with one or more implementations, consider now an example system and device that can be utilized to implement the various techniques described herein.

Example System and Device

FIG. 13 illustrates an example system generally at 1300 that includes an example computing device 1302 that is representative of one or more computing systems and/or devices that may implement the various techniques described herein. This is illustrated through inclusion of the camera platform manager module 116. The computing device 1302 may be, for example, a server of a service provider, a device associated with a client (e.g., a client device), an on-chip system, and/or any other suitable computing device or computing system.

The example computing device 1302 as illustrated includes a processing system 1304, one or more computer-readable media 1306, and one or more I/O interfaces 1308 that are communicatively coupled, one to another. Although not shown, the computing device 1302 may further include a system bus or other data and command transfer system that couples the various components, one to another. A system bus can include any one or combination of different bus structures, such as a memory bus or memory controller, a peripheral bus, a universal serial bus, and/or a processor or local bus that utilizes any of a variety of bus architectures. A variety of other examples are also contemplated, such as control and data lines.

The processing system 1304 is representative of functionality to perform one or more operations using hardware. Accordingly, the processing system 1304 is illustrated as including hardware elements 1310 that may be configured as processors, functional blocks, and so forth. This may include implementation in hardware as an application-specific integrated circuit or other logic device formed using one or more semiconductors. The hardware elements 1310 are not limited by the materials from which they are formed or the processing mechanisms employed therein. For example, processors may be comprised of semiconductor(s) and/or transistors (e.g., electronic integrated circuits (ICs)). In such a context, processor-executable instructions may be electronically-executable instructions.

The computer-readable storage media 1306 is illustrated as including memory/storage 1312. The memory/storage 1312 represents memory/storage capacity associated with one or more computer-readable media. The memory/storage component 1312 may include volatile media (such as random access memory (RAM)) and/or nonvolatile media (such as read only memory (ROM), Flash memory, optical disks, magnetic disks, and so forth). The memory/storage component 1312 may include fixed media (e.g., RAM, ROM, a fixed hard drive, and so on) as well as removable media (e.g., Flash memory, a removable hard drive, an optical disc, and so forth). The computer-readable media 1306 may be configured in a variety of other ways as further described below.

Input/output interface(s) 1308 are representative of functionality to allow a user to enter commands and information to the computing device 1302, and also allow information to be presented to the user and/or other components or devices using various input/output devices. Examples of input devices include a keyboard, a cursor control device (e.g., a mouse), a microphone, a scanner, touch functionality (e.g., capacitive or other sensors that are configured to detect physical touch), a camera (e.g., which may employ visible or non-visible wavelengths such as infrared frequencies to recognize movement as gestures that do not involve touch), and so forth. Examples of output devices include a display device (e.g., a monitor or projector), speakers, a printer, a network card, a tactile-response device, and so forth. Thus, the computing device 1302 may be configured in a variety of ways as further described below to support user interaction.

Various techniques may be described herein in the general context of software, hardware elements, or program modules. Generally, such modules include routines, programs, objects, elements, components, data structures, and so forth that perform particular tasks or implement particular abstract data types. The terms “module,” “functionality,” and “component” as used herein generally represent software, firmware, hardware, or a combination thereof. The features of the techniques described herein are platform-independent, meaning that the techniques may be implemented on a variety of commercial computing platforms having a variety of processors.

An implementation of the described modules and techniques may be stored on or transmitted across some form of computer-readable media. The computer-readable media may include a variety of media that may be accessed by the computing device 1302. By way of example, and not limitation, computer-readable media may include “computer-readable storage media” and “computer-readable signal media.”

“Computer-readable storage media” may refer to media and/or devices that enable persistent and/or non-transitory storage of information in contrast to mere signal transmission, carrier waves, or signals per se. Thus, computer-readable storage media refers to non-signal bearing media. The computer-readable storage media includes hardware such as volatile and non-volatile, removable and non-removable media and/or storage devices implemented in a method or technology suitable for storage of information such as computer readable instructions, data structures, program modules, logic elements/circuits, or other data. Examples of computer-readable storage media may include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, hard disks, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or other storage device, tangible media, or article of manufacture suitable to store the desired information and which may be accessed by a computer.

“Computer-readable signal media” may refer to a signal-bearing medium that is configured to transmit instructions to the hardware of the computing device 1302, such as via a network. Signal media typically may embody computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as carrier waves, data signals, or other transport mechanism. Signal media also include any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media.

As previously described, hardware elements 1310 and computer-readable media 1306 are representative of modules, programmable device logic and/or fixed device logic implemented in a hardware form that may be employed in some embodiments to implement at least some aspects of the techniques described herein, such as to perform one or more instructions. Hardware may include components of an integrated circuit or on-chip system, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a complex programmable logic device (CPLD), and other implementations in silicon or other hardware. In this context, hardware may operate as a processing device that performs program tasks defined by instructions and/or logic embodied by the hardware as well as hardware utilized to store instructions for execution, e.g., the computer-readable storage media described previously.

Combinations of the foregoing may also be employed to implement various techniques described herein. Accordingly, software, hardware, or executable modules may be implemented as one or more instructions and/or logic embodied on some form of computer-readable storage media and/or by one or more hardware elements 1310. The computing device 1302 may be configured to implement particular instructions and/or functions corresponding to the software and/or hardware modules. Accordingly, implementation of a module that is executable by the computing device 1302 as software may be achieved at least partially in hardware, e.g., through use of computer-readable storage media and/or hardware elements 1310 of the processing system 1304. The instructions and/or functions may be executable/operable by one or more articles of manufacture (for example, one or more computing devices 1302 and/or processing systems 1304) to implement techniques, modules, and examples described herein.

The techniques described herein may be supported by various configurations of the computing device 1302 and are not limited to the specific examples of the techniques described herein. This functionality may also be implemented all or in part through use of a distributed system, such as over a “cloud” 1314 via a platform 1316 as described below.

The cloud 1314 includes and/or is representative of a platform 1316 for resources 1318. The platform 1316 abstracts underlying functionality of hardware (e.g., servers) and software resources of the cloud 1314. The resources 1318 may include applications and/or data that can be utilized while computer processing is executed on servers that are remote from the computing device 1302. Resources 1318 can also include services provided over the Internet and/or through a subscriber network, such as a cellular or Wi-Fi network.

The platform 1316 may abstract resources and functions to connect the computing device 1302 with other computing devices. The platform 1316 may also serve to abstract scaling of resources to provide a corresponding level of scale to encountered demand for the resources 1318 that are implemented via the platform 1316. Accordingly, in an interconnected device embodiment, implementation of functionality described herein may be distributed throughout the system 1300. For example, the functionality may be implemented in part on the computing device 1302 as well as via the platform 1316 that abstracts the functionality of the cloud 1314.

CONCLUSION

Although the techniques have been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as example forms of implementing the claimed subject matter.

What is claimed is:
1. A method implemented by at least one computing device, the method comprising: displaying, by the at least one computing device and via a user interface, a prompt having authentication instructions for providing digital visual content depicting one or more visual characteristics of an item being listed on a listing service, the authentication instructions instructing a user to provide the digital visual content of a specified portion of the item being listed depicting movement of the item being listed; receiving, by the at least one computing device, the digital visual content that depicts the one or more visual characteristics of the item being listed; generating, by the at least one computing device, non-textual data indicative of the one or more visual characteristics of the item from the digital visual content; comparing, by the at least one computing device, the non-textual data to additional non-textual data indicative of visual characteristics of known authentic items; determining, by the at least one computing device, whether the item being listed is authentic based on the comparing; and responsive to a determination that the item being listed is authentic, surfacing, by the at least one computing device, an indication that a listing of the item on the listing service is allowed to include an authentic designation; or responsive to a determination that the item being listed is not authentic, surfacing, by the at least one computing device, an indication that the listing of the item on the listing service is not allowed to include the authentic designation.
2. The method as described in claim 1, wherein the authentication instructions specify a location of the item being listed to capture in the digital visual content.
3. The method as described in claim 1, wherein the authentication instructions specify a feature of the item being listed to capture in the digital visual content.
4. The method as described in claim 1, further comprising receiving an additional selection to surface the listing of the item with the authentic designation via the listing service.
5. The method as described in claim 1, wherein the digital visual content depicting the one or more visual characteristics of the item being listed comprises digital video.
6. The method as described in claim 1, wherein the digital visual content depicting the one or more visual characteristics of the item being listed comprises at least one digital image.
7. The method as described in claim 1, wherein the additional non-textual data is at least one feature vector.
8. A system comprising: one or more processors; and memory having stored thereon computer-readable instructions that are executable by the one or more processors to perform operations comprising: displaying a prompt via a user interface having authentication instructions for providing digital visual content depicting one or more visual characteristics of an item being listed on a listing service, the authentication instructions instructing a user to provide the digital visual content of a specified portion of the item being listed depicting movement of the item being listed; receiving the digital visual content that depicts the one or more visual characteristics of the item being listed; generating non-textual data indicative of the one or more visual characteristics of the item from the digital visual content; comparing the non-textual data to additional non-textual data indicative of visual characteristics of known authentic items; determining whether the item being listed is authentic based on the comparing; and controlling presentation of an authentic designation with a listing of the item according to the determining.
9. The system as described in claim 8, wherein the additional non-textual data is at least one feature vector.
10. The system as described in claim 8, wherein controlling the presentation of the authentic designation with the listing includes allowing the item to be listed via the listing service with the authentic designation responsive to a determination that the item is authentic.
11. The system as described in claim 8, wherein controlling the presentation of the authentic designation with the listing includes preventing the item from being listed via the listing service with the authentic designation responsive to a determination that the item is not authentic.
12. The system as described in claim 8, wherein the listing service prevents items from being listed with the authentic designation absent a determination that the items are authentic.
13. The system as described in claim 8, wherein the operations further comprise receiving a selection via the user interface to list the item on the listing service with the authentic designation.
14. The system as described in claim 8, further comprising receiving an additional selection to surface the listing of the item with the authentic designation via the listing service.
15. A method implemented by at least one computing device, the method comprising: displaying a prompt via a user interface having authentication instructions for providing digital visual content depicting one or more visual characteristics of an item being listed on a listing service; receiving the digital visual content that depicts the one or more visual characteristics of the item being listed, the authentication instructions instructing a user to provide the digital visual content of a specified portion of the item being listed depicting movement of the item being listed; generating non-textual data indicative of the one or more visual characteristics of the item from the digital visual content; comparing the non-textual data to additional non-textual data indicative of visual characteristics of known authentic items; determining whether the item being listed is authentic based on the comparing; and controlling presentation of an authentic designation with a listing of the item according to the determining.
16. The method as described in claim 15, wherein the additional non-textual data includes at least one feature vector.
17. The method as described in claim 15, wherein the listing service prevents items from being listed with the authentic designation absent a determination that the items are authentic.
18. The method as described in claim 15, wherein the authentication instructions instruct the user to capture digital video of the item being listed.
19. The method as described in claim 15, wherein the authentication instructions instruct the user to capture at least one digital image of the item being listed.
20. The method as described in claim 15, wherein the authentication instructions specify at least one of: a location of the item being listed to capture in the digital visual content; or a feature of the item being listed to capture in the digital visual content.